Method and device for distributing objects in a heterogeneous group of data storage devices

ABSTRACT

Method for distributing objects in a heterogeneous group of data storage devices includes breaking down each object into a plurality of blocks, distributing the blocks in different storage devices in accordance with a distribution law which consists in distributing in each of the storage devices pieces of objects consisting each of one or of a plurality of blocks. The parameter used for managing the distribution of the pieces of objects is a flexibility coefficient CF(i), representing the difference between the weights of the pieces of the object (i). Values representing the variability in the popularity of each object are periodically measured and calculated; for each object, a desired flexibility coefficient CFv(i) to be assigned to the object is calculated from said values measured or calculated at instant (t), in accordance with a principle.

The invention relates to a method and a device for distributing objects in a heterogeneous group of data storage devices.

The present tendency in the field of data storage devices is to store information in increasingly complex storage systems, called heterogeneous storage systems because they consist of storage devices which do not have the same storage capacity (the quantity of information which can be stored in a storage device) or the same bandwidth (the rate at which the information can be read from or written to a storage device).

Conventionally, the first step towards the storage of information in the different storage devices consists, at the present time, in dividing each object into a plurality of blocks and in distributing said blocks in the different storage devices according to a distribution law consisting in the distribution in each of the storage devices of pieces of objects, each consisting of one or a plurality of blocks.

For the sake of clarity, the following terminology is used throughout the text (description and claims):

-   storage device: device for recording or retrieving data;
-   client: device which accesses the data stored in the storage devices;
-   popularity of a set of data: value proportional to the quantity of data transferred per unit of time between the storage devices and the clients when this set of data is accessed;
-   variability in popularity: value characterizing the variations in popularity of a set of data over time;
-   block: set of data stored contiguously in a storage device;
-   object: set of blocks having comparable variability of popularity;
-   piece of an object: set of blocks all belonging to the same object and placed in the same storage device;
-   load of a storage device: set of data transferred between this storage device and the clients per unit of time;
-   weight of a piece: value representing the workload due to this piece in the storage device which stores this piece. This value can be assessed in a number of ways, for example as the size of the piece divided by the mean bandwidth of the storage device, as the mean time taken by the storage device to process the read or write requests for the data in this piece, and so on.

All developments up to the present time have used distribution methods based on two major principles.

The first method, which can be called the equitably distributed placement method, is based on the principle of distributing in each storage device pieces of objects whose size is proportional to the bandwidth of said storage device. This method therefore tends to fully exploit the bandwidth of the storage devices.

However, the drawback of this method is that some storage devices become full more rapidly than others, and that, consequently, as soon as a storage device is full, it is no longer possible to add pieces in the other storage devices without unbalancing the placement. Consequently, in most cases, this method does not enable the storage capacity of a heterogeneous group to be exploited fully. Although this loss of storage capacity is difficult to evaluate, since it is a function of the configuration, it has been shown that it is generally greater than 25% and can be as much as almost 80% of the storage capacity.
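By way of illustration only, the capacity lost under equitably distributed placement can be estimated with the following minimal sketch. It assumes that every storage device receives a share of the data strictly proportional to its bandwidth, so that placement stops as soon as the first device is full. The function name and the sample figures are hypothetical and are not taken from the description.

```python
def equitable_capacity_use(capacities, bandwidths):
    """Fraction of the total capacity usable when each device holds
    data in proportion to its bandwidth (illustrative assumption)."""
    total_bw = sum(bandwidths)
    # Device k holds the share bw_k / total_bw of all stored data, so it
    # is full when total_data == capacity_k * total_bw / bw_k.
    max_total = min(c * total_bw / b for c, b in zip(capacities, bandwidths))
    return max_total / sum(capacities)


# Two devices with equal bandwidth but very different capacities:
# the small device fills first and blocks further balanced placement.
print(equitable_capacity_use([100, 400], [50, 50]))
```

In this hypothetical configuration only part of the total capacity can be used, which is the kind of loss referred to above.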

The second method, which can be called non-equitably distributed placement, is based on the principle of distributing the blocks of objects in such a way as to fully exploit the storage capacity of the group of storage devices, while attempting to balance the workload on these storage devices. However, the major drawback of this method is that this balance of the workload is unstable, owing to the variations in popularity of the objects which tend to perturb this balance.

Most of the work done on this method has therefore consisted in proposing solutions for rebalancing the workload. However, this rebalancing consumes bandwidth, and, in practice, leads to a reduction in the number of accesses which the storage devices can handle simultaneously.

In conclusion, this second method appears to enable the storage capacity to be exploited fully, but leads to a significant decrease in exploitation of the bandwidth (of the order of 50%, varying according to the rebalancing methods developed).

Up to the present time, all work done on methods for placing objects in heterogeneous groups of storage devices has aimed to improve one or other of the aforesaid placement methods, namely equitably distributed placement and non-equitably distributed placement.

However, because of the specific drawbacks linked to the design of each of these placement methods, all of this work has resulted in solutions in which either the storage capacity is not fully exploited or the bandwidth is exploited at a low level.

The present invention proposes to mitigate this drawback, and its principal object is to provide a method of distributing objects in heterogeneous storage devices which leads to the optimization of both the exploitation of storage capacity and the exploitation of the bandwidth of said storage devices.

For this purpose, the invention proposes a method of distributing objects in a heterogeneous group of storage devices, which uses as the parameter for managing the distribution of pieces of objects a coefficient, called the flexibility coefficient CF(i), representative of the difference between the weights of the pieces of the object (i), the method comprising, when any new object is added, distributing the blocks of objects in said storage devices by specifying a given value of the flexibility coefficient, and, for objects placed in the group of storage devices:

-   periodically measuring and calculating values representative of the variability in popularity of each object,
-   for each object, calculating from the aforesaid values measured or calculated at the instant t, a desired flexibility coefficient CFv(i) to be assigned to said object, according to a principle consisting in assigning to each object a flexibility coefficient inversely proportional to its variability in popularity,
-   for each object (i), measuring and calculating at the instant t, the real flexibility coefficient CFr(i) of said object, representative of the difference between the weights of the pieces of said object,
-   and commanding a movement of blocks of pieces of objects between the storage devices so as to obtain, for each object (i), a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i) for this object.

The basic principle of the invention is therefore that of continually modifying, in a periodic way, the weights of the pieces of objects initially distributed in the different storage devices, using as the parameter for managing these movements a coefficient called the flexibility coefficient which is preferably inversely proportional to the variability in popularity.

According to this principle, the flexibility coefficient generally varies between two extreme values CF min and CF max, determining different placement modes:

-   if CF(i)=CF min, the weights of all the pieces of the object (i) are equal and the placement used is an equitably distributed placement;
-   if CF(i)=CF max, the object (i) consists of only one piece placed in a storage device, and the placement is therefore a pure non-equitably distributed placement;
-   between these two extremes, the different values of CF(i) allow the weights of the pieces to be modulated so as to obtain either a better balance of the load by making CF(i) approach CF min, or a better exploitation of the storage capacity by making CF(i) approach CF max.

The invention therefore consists in a hybrid placement method whose principal result is to place the objects having a stable popularity preferentially according to the non-equitably distributed placement method (CF(i) close to CF max), and to place the objects having an unstable popularity preferentially according to the equitably distributed placement method (CF(i) close to CF min).

In practice, simulations performed on groups of storage devices have shown that the method according to the invention yielded a rate of use of more than 80% in storage capacity and more than 85% in bandwidth.

In an advantageous embodiment, the real flexibility coefficient CFr(i) calculated for each object (i) is such that:

CFr(i) = Pdev(i) / Pmean(i)

where:

-   Pmean(i) is the mean of the weights of the pieces of the object (i),
-   Pdev(i) is the standard deviation with respect to the mean of the weights of the pieces of the object (i).
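The ratio Pdev(i)/Pmean(i) can be sketched as follows. This is a hedged illustration and not the definitive computation. It assumes that the list of weights contains one entry per storage device of the group, with a weight of zero where the object has no piece, and that the population standard deviation is used; neither choice is stated in the description. The function name is hypothetical.

```python
from statistics import mean, pstdev


def real_flexibility(weights):
    """CFr(i) = Pdev(i) / Pmean(i) for the piece weights of one object.

    Assumption: `weights` has one entry per storage device, zero where
    the object stores nothing, and Pdev is a population deviation.
    """
    m = mean(weights)
    if m == 0:
        raise ValueError("the object has no weight on any storage device")
    return pstdev(weights) / m


# Equal weights everywhere: equitably distributed placement.
print(real_flexibility([2, 2, 2, 2]))
# A single piece on one of four devices: non-equitably distributed placement.
print(real_flexibility([4, 0, 0, 0]))
```

Under these assumptions the first call gives the lowest possible value and the second the highest value reachable with four devices, which matches the two extreme placement modes described above.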

In another advantageous embodiment:

-   in a preliminary stage, predetermined values of variability in popularity are assigned to the different blocks, said blocks are classified in decreasing order of variability in popularity, and the objects are created by the association of contiguously classified blocks;
-   and, in the course of the management of the group of storage devices, the blocks are reclassified periodically in decreasing order of variability in popularity according to the measured information on the popularity and variability of said blocks.

Since the method according to the invention consists, in particular, of calculating the ideal flexibility coefficient for each object (i) (desired flexibility coefficient), and therefore consists of working on objects, this advantageous embodiment, which results in the creation of objects consisting of blocks having very similar variability in popularity, enables the work on each of said objects to be done efficiently.

Additionally, and advantageously, for each object (i) the desired flexibility coefficient CFv(i) is determined by means of a decision method called the “decision trapezium”, consisting of a trapezium having a first base consisting of a graduated axis whose vector is the variability in popularity of said object, and a second base consisting of a graduated axis whose vector is the desired flexibility coefficient for said object, said vector having the opposite direction to the preceding one, said decision method comprising the steps of:

-   specifying the ceiling and floor values of each of the variables, namely the variability in popularity and the desired flexibility coefficient, so as to produce a trapezium whose two sides consist of the segments [var ceiling-CF floor] and [var floor-CF ceiling] respectively,
-   if the variability is greater than or equal to var ceiling, projecting this variability onto the value CF floor of the flexibility coefficient;
-   similarly, if the variability is less than or equal to var floor, projecting this variability onto the value CF ceiling;
-   if the variability lies in the interval [var floor-var ceiling], linearly projecting this variability onto the interval [CF floor-CF ceiling].
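The steps above can be sketched as a clamped linear interpolation with reversed direction. This is a minimal illustration: the function and parameter names are hypothetical, and it assumes that var floor is strictly less than var ceiling.

```python
def desired_flexibility(var, var_floor, var_ceiling, cf_floor, cf_ceiling):
    """Project a variability in popularity onto a desired coefficient CFv.

    High variability maps to CF floor, low variability to CF ceiling,
    and intermediate values are projected linearly between the two.
    """
    if var >= var_ceiling:
        return cf_floor
    if var <= var_floor:
        return cf_ceiling
    # Position of var inside [var floor, var ceiling], between 0 and 1.
    t = (var - var_floor) / (var_ceiling - var_floor)
    # The CF axis runs in the opposite direction to the variability axis.
    return cf_ceiling - t * (cf_ceiling - cf_floor)


print(desired_flexibility(0.5, 0.0, 1.0, 0.0, 2.0))
```

With these sample limits, a variability halfway between var floor and var ceiling is projected onto the midpoint of the interval [CF floor-CF ceiling].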

This decision trapezium forms a decision tool which enables the desired flexibility coefficient for an object to be determined instantaneously from the variability in popularity of the object.

Additionally, this decision trapezium offers a flexibility of behavior due to the fact that the global behavior can be modified in accordance with the choices of the ceiling and floor values of variability in popularity and desired flexibility coefficients.

In another advantageous embodiment, and in order to obtain for each object (i) a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i):

-   a parameter called the CF distance, equal to the absolute value |CFv(i)−CFr(i)|, is calculated for each object (i);
-   for the objects (i) for which CFv(i)<CFr(i), and in the first place for the objects for which the CF distance is maximum, a block of these objects belonging to the piece of highest weight is moved towards the storage device containing the piece of lowest weight;
-   and, for the objects (i) for which CFv(i)>CFr(i), and in the first place for the objects for which the CF distance is maximum, a block of these objects is moved from the fullest storage device towards the least full storage device.
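The ordering of the objects described above can be sketched as follows. The sketch only computes the CF distance and splits the objects into two work lists sorted by decreasing CF distance; it does not move any block. The function name, the dictionary layout and the sample values are hypothetical.

```python
def plan_reconfiguration(cfv, cfr):
    """Split objects into those to rigidify and those to flexibilize.

    cfv, cfr: dicts mapping an object identifier to its desired and real
    flexibility coefficients. Each returned list is sorted so that the
    object with the largest CF distance comes first.
    """
    distance = {i: abs(cfv[i] - cfr[i]) for i in cfv}
    to_rigidify = sorted(
        (i for i in cfv if cfv[i] < cfr[i]),
        key=lambda i: distance[i], reverse=True)
    to_flexibilize = sorted(
        (i for i in cfv if cfv[i] > cfr[i]),
        key=lambda i: distance[i], reverse=True)
    return to_rigidify, to_flexibilize


cfv = {"a": 0.2, "b": 1.0, "c": 0.5, "d": 0.9}
cfr = {"a": 0.8, "b": 0.4, "c": 0.6, "d": 0.9}
print(plan_reconfiguration(cfv, cfr))
```

An object whose real coefficient already equals the desired one, such as "d" in this example, appears in neither list.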

This mode of implementation, intended to impart to each object a real flexibility coefficient equal to the desired flexibility coefficient, results in a gain in bandwidth for objects whose real flexibility coefficient is decreasing, and a gain in storage capacity for objects whose real flexibility coefficient is increasing.

Additionally, this mode of implementation, which can be divided into two separate reconfiguration processes, according to whether CFv(i) is greater than or less than CFr(i), enables the desired result CFr(i)≅CFv(i) to be obtained with the use of a small percentage of the bandwidth of the storage device.

Moreover, the two aforesaid reconfiguration processes have separate working spaces and therefore cannot interfere with each other.

In another advantageous embodiment intended to mitigate any imbalance in the workload due to imperfections in the management of the distribution of the blocks of objects:

-   the most loaded storage device, called the hot SD, and the least loaded storage device, called the cold SD, are selected;
-   an infrequently accessed block, called a cold block, is found in the least loaded storage device, and a frequently accessed block, called a hot block, is found in the most loaded storage device, so that these two blocks can be exchanged to rebalance the load;
-   the hot blocks in the hot SD are sorted as follows:
-   the objects are classified in decreasing order of mean popularity,
-   after this classification, the n most popular objects, forming a region called the high-popularity region, are selected,
-   the n objects are broken down into blocks which are processed individually to sort them and distribute them in four classes:
    -   class 1: blocks for which the movement from the hot SD towards the cold SD decreases the real flexibility coefficient of the objects to which they belong,
    -   class 2: blocks moved towards a storage device other than the fullest storage device,
    -   class 3: blocks moved towards the fullest storage device,
    -   class 4: blocks whose movement increases the CF distance,
-   in the same way, in order to sort the cold blocks in the cold SD, the n least popular objects are selected after classification, these objects forming a region called the region of low popularity, and are broken down into blocks which are processed individually to sort them and distribute them in the aforesaid four classes,
-   after which, starting from these two groups of sorted blocks, namely the groups of hot blocks and cold blocks:
    -   the block in the hot SD to be transferred, consisting of a previously sorted block which is located in said hot SD and whose class number is lowest, is found,
    -   the block in the cold SD to be transferred, consisting of a previously sorted block which is located in said cold SD and whose class number is lowest, is found,
    -   and the selected block in the hot SD is exchanged with the selected block in the cold SD.
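The final selection step can be sketched as follows, assuming that the blocks of the hot SD and of the cold SD have already been sorted into classes 1 to 4. The function names and the data layout (pairs of block identifier and class number, and a dictionary of loads) are hypothetical.

```python
def pick_devices(load):
    """Return (hot SD, cold SD): the most and the least loaded devices."""
    return max(load, key=load.get), min(load, key=load.get)


def pick_exchange(hot_blocks, cold_blocks):
    """Choose the pair of blocks to exchange.

    hot_blocks, cold_blocks: lists of (block_id, class_number) for blocks
    located in the hot SD and in the cold SD respectively. In each list
    the block with the lowest class number is preferred.
    """
    if not hot_blocks or not cold_blocks:
        return None  # nothing to exchange
    hot = min(hot_blocks, key=lambda b: b[1])
    cold = min(cold_blocks, key=lambda b: b[1])
    return hot[0], cold[0]


print(pick_devices({"sd1": 0.9, "sd2": 0.3, "sd3": 0.6}))
print(pick_exchange([("h1", 3), ("h2", 1)], [("c1", 2), ("c2", 4)]))
```

When several blocks share the lowest class number, this sketch simply keeps the first one encountered; the description does not specify a tie-breaking rule.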

This “rebalancing” is thus designed to implement operations consisting in the exchange of the infrequently accessed blocks located in the less loaded storage devices with the frequently accessed blocks located in the highly loaded storage devices.

This rebalancing process thus enables the imbalances in workload to be corrected, while avoiding, or at least minimizing, any problem of interference with the reconfiguration processes.

The invention includes a device for applying the method according to the invention. The invention therefore relates to a device for distributing objects broken down into a plurality of blocks in a heterogeneous group of storage devices according to a distribution law consisting in distributing, in said storage devices, pieces of objects each consisting of a block or a plurality of blocks, said distribution device comprising:

-   a module, called the analysis module, adapted for periodically measuring and calculating values representative of the variability in popularity of each object;
-   a module, called the decision module, adapted for calculating for each object (i), from the parameters measured and calculated by the analysis module, at an instant t, a coefficient CFr(i), called the real flexibility coefficient, representative of the difference between the weights of the pieces of this object (i), and a coefficient CFv(i), called the desired flexibility coefficient, to be assigned to the object (i), and calculated according to a principle consisting in assigning to each object a flexibility coefficient inversely proportional to its variability in popularity;
-   and a module, called the reconfiguration module, adapted for commanding a movement of blocks of pieces of objects between the storage devices, so as to obtain for each object (i) a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i) for this object.

Additionally, and advantageously, this distribution device comprises a module, called the rebalancing module, adapted for carrying out operations consisting in exchanging infrequently accessed blocks located in less loaded storage devices with frequently accessed blocks located in highly loaded storage devices.

Other characteristics, objects and advantages of the invention will be made clear by the following detailed description which refers to the attached drawings which represent a preferred embodiment of the invention by way of example and without restrictive intent. In these drawings,

FIG. 1 is a synoptic diagram of a device according to the invention,

FIG. 2 is a diagram representing a mode of determining the desired flexibility coefficient CFv(i) for each object (i),

and FIG. 3 is a graph representing the storage capacity exploited and the bandwidth exploited for a heterogeneous group of storage devices managed by a method according to the invention.

The device according to the invention shown in FIG. 1 is intended for the management of the placement of objects in a heterogeneous storage system 1 which conventionally comprises a controller 2 for managing said system, linked by an interconnection to a plurality of storage devices 3, 4.

This device comprises, in the first place, an analyzer 5 adapted to calculate, from measurements made on the heterogeneous system 1, the quality of the load balancing, the variability in popularity of the objects, the occupation of the storage capacity of the system, and the maintenance of service quality. It sends these estimates to the decision-making part 6 of an object flexibility controller 9.

The decision maker 6 monitors the variations in the estimates supplied by the analyzer 5 in such a way that a situation of overload or poor occupation of storage capacity can be rectified. It is designed to decide, for example, to decrease the flexibility of all the objects stored in the storage system 1 if the balance of the load is found to be degraded and thus to be leading to poor service quality. These decisions are sent to a statement of flexibility of the objects 7.

This flexibility statement 7 stores, for each object (i) stored in the storage system 1, its desired flexibility coefficient CFv(i), together with the real flexibility coefficient CFr(i). The flexibility statement 7 is updated according to the decisions of the decision maker 6 (which cause the modification of the desired flexibility coefficient) and according to the actions of the reconfiguration unit 8 described below (which cause the modification of the real flexibility coefficient).

The device also comprises, as mentioned above, a reconfiguration unit 8 adapted to move the data (by issuing input/output requests) in such a way that the real flexibility coefficient of the objects approaches the desired value set by the decision maker 6.

However, it should be noted that the decisions of the flexibility controller 9 cannot be perfect, for the following reasons:

-   The analyzer 5 cannot make completely accurate and instantaneous measurements.
-   The popularities are constantly changing: the work done by the flexibility controller 9 at any instant t is intended to resolve the problems of a situation measured at the instant t−1. This work will actually affect the performance of the system 1 at the instant t+1. If the popularities have changed between the instants t−1 and t+1, the flexible placement will not have an optimal effect, and therefore some of the load will be unbalanced.
-   There are no objects whose popularity remains constant. Therefore, as soon as a placement ceases to be equitably distributed for all the objects, an imbalance occurs.

This is why the flexible placement requires the support of a dynamic rebalancer 11. But, by contrast with what happens in the case of non-equitably distributed placement, the rebalancer has very little work to do: it only has to make up for the imbalance caused by the imperfections of the flexibility controller 9.

The quality of the flexible placement is therefore subject to the performance of four mechanisms, namely the decision maker 6, the reconfiguration unit 8, the rebalancer 11 and the analyzer 5.

To achieve this placement quality, these mechanisms must be designed so that:

-   they assign the correct flexibility coefficient to each object: this is because, if flexibility coefficients opposite to the required ones are used, the results may be considerably worse than those of non-equitably distributed placement;
-   they limit the reaction time of the flexibility controller to a minimum: the analyzer 5 detects changes in the way in which the system 1 reacts (period d1), then the decision maker 6 assigns the desired flexibility coefficients (period d2), then the reconfiguration unit 8 moves the data to provide a match between the desired flexibility coefficients and the real flexibility coefficients (period d3). d2 is very short, d1 is moderately short (some analyses require the monitoring of the system 1 for a certain time if they are to be appropriate), and d3 can be long if many reconfiguring movements are required. As d1+d2+d3 increases in length, the load becomes more unbalanced.

In order to meet these objectives, the operation of each of the hybrid placement mechanisms is described in detail below.

In the first place, the decision maker 6 processes the parameters measured and/or calculated by the analyzer 5, using the following criteria:

-   exploitation of the storage capacity: the free storage capacity should be well distributed, in other words its arrangement should maximize the quantity of data which can be added to the system 1 (given the initial flexibility coefficient of each newly stored object);
-   variability in popularity: the higher the variability of an object, the greater is the risk of unbalancing the load;
-   meeting the constraints of service quality: if the service quality constraints associated with an object are not met (for example, if a display screen is not served with a sufficient data rate, causing flickering), the service quality must be stabilized by decreasing the flexibility coefficient of this object, in such a way that it is equitably distributed and thus benefits from greater bandwidth. If the service quality of a set of objects is chaotic, it can be stabilized by globally decreasing the flexibility coefficient for all the objects;
-   quality of load balancing: if the global balance is poor, the global flexibility coefficient of the data must be decreased.

The decision maker 6 is therefore designed to calculate the ideal flexibility coefficient for each object (i), which will be the desired flexibility coefficient CFv(i).

In the first place, when the decision maker 6 is working on objects, and in order to optimize the decision mechanism, it is desirable for each object to consist of blocks having variabilities in popularity of the same order of magnitude.

For this purpose, values of variability in popularity, determined in a random way or as a function of measurements obtained from previous investigation, are assigned to the blocks in a preliminary step, and said blocks are classified in decreasing order of variability in popularity.

The objects are then created by the association of contiguously classified blocks. By way of example, the first object can consist of the first n classified blocks, the second object can consist of the following n blocks, and so on.
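The creation of objects from classified blocks can be sketched as follows. The function name, the dictionary of per-block variabilities and the sample values are hypothetical; the grouping into runs of n contiguously classified blocks follows the example given above.

```python
def build_objects(variability, n):
    """Group blocks into objects of n contiguously classified blocks.

    variability: dict mapping a block identifier to its variability in
    popularity. Blocks are ranked in decreasing order of variability and
    cut into consecutive groups of n (the last group may be shorter).
    """
    ranked = sorted(variability, key=variability.get, reverse=True)
    return [ranked[k:k + n] for k in range(0, len(ranked), n)]


blocks = {"b1": 0.1, "b2": 0.9, "b3": 0.5, "b4": 0.7, "b5": 0.2}
print(build_objects(blocks, 2))
```

Running the same function again on freshly measured variabilities would correspond to a periodic reclassification of the blocks.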

When this object creation has been completed, the system is started up, and the blocks are periodically reclassified in decreasing order of variability in popularity, according to the information on popularity and variability of the blocks acquired by the analyzer 5.

Similarly, the decision maker 6 works on objects whose content varies constantly, especially in an initial stabilization step, said objects consisting of blocks which can contain disparate data belonging to different files, but having very similar variability in popularity, enabling the decision maker 6 to work efficiently on each of the objects.

The decision method, which enables the desired flexibility coefficient to be determined for each object as a function of the variability in popularity monitored by the analyzer 5, is in turn based on the use of a “decision trapezium” shown in FIG. 2.

One of the bases of this trapezium consists of a graduated axis whose vector is the variability in popularity, while the second base consists of a graduated axis whose vector is the desired flexibility coefficient, said vector having a direction opposite that of the preceding one.

The first step of this decision method consists in specifying the ceiling and floor values of each of the variables, namely the variability in popularity and desired flexibility coefficient, so as to produce a trapezium shown in FIG. 2, whose two sides consist, respectively, of the segments [var ceiling-CF floor] and [var floor-CF ceiling].

When this decision trapezium has been defined, the decision method consists in the following steps:

-   if the variability is greater than or equal to var ceiling, this variability is projected onto the value CF floor of the flexibility coefficient; in other words, the desired flexibility coefficient of an object whose variability is greater than or equal to var ceiling is set at the value of CF floor;
-   similarly, if the variability is less than or equal to var floor, this variability is projected onto the value CF ceiling;
-   if the variability lies in the interval [var floor-var ceiling], this variability is projected linearly onto the interval [CF floor-CF ceiling].

This decision trapezium thus forms a decision tool which enables the desired flexibility coefficient for an object to be determined instantaneously from the variability in popularity of the object.

Additionally, the behavior of the decision maker 6 can be modified according to the choice of the ceiling and floor values of variability in popularity and desired flexibility coefficient.

Thus, by way of example:

-   an increase of the values CF floor and CF ceiling (a shift to the right in FIG. 2) leads to a global increase in the desired flexibility coefficients generated by the decision maker 6, and consequently to a better exploitation of the storage capacity with a penalty in terms of bandwidth;
-   conversely, a reduction of CF floor and CF ceiling leads to optimization of the exploitation of the bandwidth of the storage system, with a penalty in terms of its storage capacity; a similar argument can be made in respect of the values var ceiling and var floor;
-   a decrease of CF floor and an increase of CF ceiling (increase of the interval between these limits) will cause greater differentiation in the processing of the objects: the stable objects will be more easily “flexibilized” (by an increase in their real flexibility coefficient) and the variable objects will be more easily “rigidified” (by a decrease in their real flexibility coefficient). Consequently, the power of the flexibility controller 9 is increased. Conversely, this power of the flexibility controller 9 can also be decreased by bringing the two limits of the interval of flexibility coefficients closer together.
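The effect of these choices can be illustrated numerically with the following sketch, which compares a baseline setting with a shifted interval and a widened interval of flexibility coefficients. All limits and variabilities are hypothetical values chosen for the illustration, and the helper `project` is a compact restatement of the linear projection, assuming var floor is strictly less than var ceiling.

```python
def project(var, var_floor, var_ceiling, cf_floor, cf_ceiling):
    """Clamped linear projection of a variability onto a desired CF."""
    t = (var - var_floor) / (var_ceiling - var_floor)
    t = min(1.0, max(0.0, t))  # outside the interval, stay on the limits
    return cf_ceiling - t * (cf_ceiling - cf_floor)


stable, variable = 0.1, 0.9  # hypothetical variabilities in popularity
settings = {
    "baseline": (0.4, 0.6),
    "shifted up": (0.6, 0.8),  # CF floor and CF ceiling both increased
    "widened": (0.0, 1.0),     # CF floor decreased, CF ceiling increased
}
for label, (cf_floor, cf_ceiling) in settings.items():
    cf_stable = project(stable, 0.0, 1.0, cf_floor, cf_ceiling)
    cf_variable = project(variable, 0.0, 1.0, cf_floor, cf_ceiling)
    print(label, round(cf_stable, 2), round(cf_variable, 2))
```

In this illustration the shifted setting raises the desired coefficient of both objects, while the widened setting increases the gap between the stable object and the variable object.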

The decision trapezium thus enables the flexibility coefficient to be assigned to each object to be determined case by case, while also enabling the global policy for all the objects to be modified according to events intervening in the operation of the storage system 1.

Thus, for example, if a storage device 3 or 4 of the storage system 1 fails, some of the bandwidth is lost and the service quality can therefore no longer be maintained. In this case, the decision trapezium can be modified to gain bandwidth, with a penalty in terms of storage capacity, so as to re-establish acceptable service quality temporarily while the defective storage device is replaced.

Secondly, the reconfiguration unit 8 is adapted to act in such a way that the real flexibility coefficient CFr(i) of each object (i) is equal to the desired flexibility coefficient CFv(i) determined by the decision maker 6, in order to gain bandwidth for objects whose real flexibility coefficient decreases, and to gain storage capacity for objects whose real flexibility coefficient increases.

For this purpose, in the first place, the operation method of this reconfiguration unit 8 uses a parameter called the “CF distance”, equal to the absolute value |CFv(i)−CFr(i)|.

This operating method can be divided into two processes:

-   a reconfiguration process, called BW reconfiguration, which is designed to gain bandwidth and is applied to the objects whose desired CF is less than the real CF, and which therefore need to be rigidified; for this purpose, this process consists in moving, for the object to be rigidified, a block belonging to the piece having the highest weight towards the storage device containing the piece having the lowest weight; this process is also designed to process, in the first place, the objects whose CF distance is largest;
-   a reconfiguration process, called SC reconfiguration, which is designed to gain storage capacity, and is applied to the objects whose desired CF is greater than the real CF, and which therefore need to be flexibilized; for this purpose, this process consists in selecting the objects having a maximal CF distance and moving a block of these objects from the fullest storage device towards the least full storage device.
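One elementary move of each process can be sketched as follows. This is an illustration under stated assumptions: an object is represented by a hypothetical dictionary giving the number of its blocks on each storage device, the weight of a piece is taken as its block count divided by the device bandwidth, and, for the SC move, the source is assumed to be the fullest device among those actually holding a block of the object. None of these representations is imposed by the description.

```python
def bw_step(pieces, bandwidth):
    """BW reconfiguration: move one block from the heaviest piece of the
    object towards the device holding its lightest piece (in place)."""
    weight = {d: n / bandwidth[d] for d, n in pieces.items()}
    src = max(weight, key=weight.get)
    dst = min(weight, key=weight.get)
    if src != dst and pieces[src] > 0:
        pieces[src] -= 1
        pieces[dst] += 1
    return src, dst


def sc_step(pieces, fill):
    """SC reconfiguration: move one block of the object from the fullest
    device holding it towards the least full device of the group."""
    holders = [d for d, n in pieces.items() if n > 0]
    src = max(holders, key=lambda d: fill[d])
    dst = min(fill, key=fill.get)
    if src != dst:
        pieces[src] -= 1
        pieces[dst] = pieces.get(dst, 0) + 1
    return src, dst


obj = {"A": 6, "B": 2}
print(bw_step(obj, {"A": 1.0, "B": 1.0}), obj)
print(sc_step(obj, {"A": 0.9, "B": 0.5, "C": 0.1}), obj)
```

The first move brings the piece weights of the object closer together, which rigidifies it; the second relieves the fullest device, at the cost of spreading the object less evenly.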

This choice of two separate processes makes it possible to make good use of the decision capacities of the decision maker 6, while using a small percentage of the bandwidth of the storage system. It should also be noted that the BW reconfiguration process works on objects which have to be rigidified, whereas the SC reconfiguration process works on objects which have to be flexibilized. Consequently, these two processes have separate work spaces and therefore cannot interfere with each other.

Finally, the objective of the rebalancer 11, as mentioned above, is to mitigate the imbalances in workload arising from the imperfections of the flexibility controller 9.

The rebalancing method is designed to provide, for this purpose, operations consisting in exchanging infrequently accessed blocks (cold blocks) located in less loaded storage devices (cold SDs) with frequently accessed blocks (hot blocks) located in highly loaded storage devices (hot SDs).

This rebalancing method, being completely separate from the reconfiguration, has to be designed to avoid or at least minimize any problem of interference with the reconfiguration processes, particularly the BW reconfiguration process, and therefore operates in a special way.

This rebalancing method requires a special classification method which changes according to the information transmitted by the analyzer 5, and which is explained below.

In the first place, the most highly loaded storage device (hot SD) and the least loaded storage device (cold SD) are selected.

In order to move a hot block from the hot SD towards the cold SD, the objects are classified in decreasing order of mean popularity, and the n most popular objects are selected after this classification, where n is predetermined and is for example equal to 25, forming a “high region” of popularity.

These selected objects are then broken down into blocks which are then processed individually by being sorted and distributed in the following four classes:

-   class 1: blocks whose movement from the hot SD towards the cold SD decreases the real flexibility coefficient of the objects to which they belong, thus leading to a rigidification of these objects. This class 1 therefore relates to the blocks which create no risk of conflict with the BW reconfiguration;
-   class 2: blocks moved towards a storage device other than the fullest storage device (in other words, when the cold SD differs from the fullest storage device); this class 2 therefore relates to the blocks which create no risk of conflict with the SC reconfiguration;
-   class 3: blocks moved towards the fullest storage device (the class opposite to class 2), and therefore capable of creating a risk of conflict with the SC reconfiguration;
-   class 4: blocks whose movement increases the CF distance (the class opposite to class 1), and which may create risks of conflict with the BW reconfiguration.
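The tests underlying these four classes can be sketched in Python. The text does not spell out how a block that satisfies several of the conditions is assigned to a single class, so the sketch below only evaluates the individual conditions and leaves the final class assignment open. The function and parameter names are hypothetical, the piece weights are assumed to be block counts with an entry for both storage devices, and the real flexibility coefficient is assumed to follow the ratio of claim 2 with a population standard deviation.

```python
from statistics import mean, pstdev


def real_cf(weights):
    """CFr = standard deviation of the piece weights / mean of the piece weights."""
    return pstdev(weights) / mean(weights)


def move_effects(pieces, cf_desired, hot_sd, cold_sd, fullest_sd):
    """Evaluate the conditions behind classes 1 to 4 for moving one block.

    pieces: {storage_device: number_of_blocks} for the object owning the block
            (assumed to contain both hot_sd and cold_sd).
    """
    before = real_cf(list(pieces.values()))
    moved = dict(pieces)
    moved[hot_sd] -= 1   # one block leaves the hot SD
    moved[cold_sd] += 1  # and arrives on the cold SD
    after = real_cf(list(moved.values()))
    return {
        # Class 1 condition: the move rigidifies the object.
        "decreases_real_cf": after < before,
        # Class 2 / class 3 condition: is the destination the fullest device?
        "targets_fullest_sd": cold_sd == fullest_sd,
        # Class 4 condition: the move takes the object further from its desired CF.
        "increases_cf_distance": abs(cf_desired - after) > abs(cf_desired - before),
    }


# Object concentrated on the hot SD: moving a block away evens out the pieces.
even_out = move_effects({"hot": 6, "cold": 2}, 0.0, "hot", "cold", fullest_sd="hot")

# Object concentrated on the cold SD: the same move makes the pieces more uneven.
unbalance = move_effects({"hot": 2, "cold": 6}, 0.0, "hot", "cold", fullest_sd="cold")
```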

In the same way, in order to process the movement of a cold block from the cold SD towards the hot SD, an identical classification into the same four classes is carried out for the blocks of the n least popular objects forming a “low popularity” region.
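The selection of the hot SD, the cold SD and the two popularity regions can be sketched in Python as follows. The function name, the dictionary layouts and the load and popularity figures are illustrative assumptions; only the default n = 25 comes from the text.

```python
def select_regions(sd_load, mean_popularity, n=25):
    """Pick the hot SD, the cold SD and the high and low popularity regions.

    sd_load:         {storage_device: load}        (assumed layout)
    mean_popularity: {object: mean popularity}     (assumed layout)
    n must be positive; with fewer than 2 * n objects the two regions overlap.
    """
    hot_sd = max(sd_load, key=sd_load.get)    # most highly loaded device
    cold_sd = min(sd_load, key=sd_load.get)   # least loaded device
    # Objects in decreasing order of mean popularity.
    ranked = sorted(mean_popularity, key=mean_popularity.get, reverse=True)
    high_region = ranked[:n]    # n most popular objects
    low_region = ranked[-n:]    # n least popular objects
    return hot_sd, cold_sd, high_region, low_region


sd_load = {"a": 0.9, "b": 0.2, "c": 0.5}
mean_popularity = {"o1": 10, "o2": 50, "o3": 1, "o4": 30}
hot_sd, cold_sd, high_region, low_region = select_regions(sd_load, mean_popularity, n=2)
```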

After this classification, the steps of the balancing method are as follows:

-   1) Finding the block in the hot SD to be transferred, this block belonging to an object classified in the high popularity region. For this purpose, the block having the lowest class number (class 1 taking priority), and whose object has a minimal CF distance, in the hot SD is chosen. It should be noted that this choice of a minimal CF distance leads, in particular, to a decrease in the risk of conflict with the SC reconfiguration for the blocks located in class 3, and with the BW reconfiguration for the objects located in class 4, since these reconfigurations lead to the movement of virtually only the data of the objects having a large CF distance.
-   2) Finding the block in the cold SD to be transferred, this block belonging to an object classified in the low popularity region (the finding process being similar to that of step 1) above).
-   3) Exchanging the block in the hot SD selected in step 1) above with the block in the cold SD selected in step 2) above.
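The three steps above can be sketched in Python, assuming that the class number and the CF distance of each candidate block have already been computed. The tuple layout, the function names and the `placement` dictionary are hypothetical. Using the CF distance as a tie-breaker within the lowest class is one reading of step 1).

```python
def pick_block(candidates):
    """Return the block with the lowest class number, then the smallest CF distance.

    candidates: list of (block_id, class_number, cf_distance) for one storage device.
    """
    return min(candidates, key=lambda c: (c[1], c[2]))[0]


def exchange(placement, hot_sd, cold_sd, hot_candidates, cold_candidates):
    """Swap the chosen hot block and the chosen cold block between the two devices.

    placement: {block_id: storage_device}, updated in place (assumed layout).
    """
    hot_block = pick_block(hot_candidates)    # step 1)
    cold_block = pick_block(cold_candidates)  # step 2)
    placement[hot_block] = cold_sd            # step 3): exchange the two blocks
    placement[cold_block] = hot_sd
    return hot_block, cold_block


placement = {"h1": "hot", "h2": "hot", "h3": "hot", "c1": "cold", "c2": "cold"}
hot_candidates = [("h1", 3, 0.1), ("h2", 1, 0.4), ("h3", 1, 0.2)]
cold_candidates = [("c1", 2, 0.5), ("c2", 2, 0.3)]
swapped = exchange(placement, "hot", "cold", hot_candidates, cold_candidates)
```

In this example, block h1 has the smallest CF distance but is passed over because it is in class 3; the class number takes priority over the CF distance.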

The design of the flexibility controller 9, leading to “flexible placement”, combined with this rebalancing process, leads to simultaneous optimization of the exploitation of the storage capacity and the exploitation of the bandwidth of the storage systems, as shown in the example of use described below with reference to FIG. 3.

The storage system 1 which was used consisted of two hard disks having a storage capacity of 18.6 GB and a bandwidth of 34.2 MB/s, and two hard disks having a storage capacity of 43 GB and a bandwidth of 28.8 MB/s.

In FIG. 3, which illustrates the results, the operating time in minutes is shown on the abscissa, while the rates of exploitation of the bandwidth and of the storage capacity are shown on the ordinate.

At the start of operation, the storage system 1 is filled in a random way. As shown by the curve of FIG. 3, during an initial stabilization phase the flexibility controller 9 tends to release storage capacity while stabilizing the bandwidth as well as possible. Subsequently, the information supplied by the analyzer 5 becomes progressively more accurate and more representative of the characteristics of the objects, enabling the decision maker 6 to calculate desired flexibility coefficients which are increasingly appropriate.

For each object, the real flexibility coefficient also gradually approaches the value of the desired flexibility coefficient, as a result of the action of the reconfiguration unit 8.

At the end of this stabilization phase, the flexibility controller 9 provides a cruising regime in which a 77% rate of use of the storage capacity and an 89% rate of use of bandwidth are achieved.

By way of comparison, an “equitably distributed” placement provides, in the same conditions, a 100% rate of use of the bandwidth, but only a 35% rate of use of the storage capacity.

Consequently, by comparison with the known placement methods, the “flexible interlacing” according to the invention provides high rates of use of both bandwidth and storage capacity, and does not lead to a “sacrifice” of one of these rates of use in favor of the other.

1. A method of distributing objects in a heterogeneous group of data storage devices, the method comprising the steps of: breaking down each object into a plurality of blocks, distributing said blocks in the different storage devices in accordance with a distribution law which consists in distributing, in each of said storage devices, pieces of objects consisting each of one or of a plurality of blocks, the said method being characterized in that it uses as the parameter for managing the distribution of pieces of objects a coefficient, called the flexibility coefficient CF(i), representative of the difference between the weights of the pieces of the object (i), and characterized in that it comprises the steps of: when any new object is added, distributing the blocks of objects in said storage devices by specifying a given value of the flexibility coefficient, and for objects placed in the group of storage devices: periodically measuring and calculating values representative of the variability in popularity of each object, for each object, calculating from the aforesaid values measured or calculated at the instant t, a desired flexibility coefficient CFv(i) to be assigned to said object, according to a principle consisting in assigning to each object a flexibility coefficient inversely proportional to its variability in popularity, for each object (i), measuring and calculating at the instant t, the real flexibility coefficient CFr(i) of said object, representative of the difference between the weights of the pieces of said object, and commanding a movement of blocks of pieces of objects between the storage devices so as to obtain, for each object (i), a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i) for this object.
2. The distribution method as claimed in claim 1, wherein the real flexibility coefficient CFr(i) calculated for each object (i) is such that: CFr(i) = Pdev(i)/Pmean(i) where: Pmean(i) is the mean of the weights of the pieces of the object (i), and Pdev(i) is the standard deviation with respect to the mean of the weights of the pieces of the object (i).
3. The distribution method as claimed in claim 1, wherein: in a preliminary stage, predetermined values of variability in popularity are assigned to the different blocks, said blocks are classified in decreasing order of variability in popularity, and the objects are created by the association of contiguously classified blocks; and, in the course of the management of the group of storage devices, the blocks are reclassified periodically in decreasing order of variability in popularity according to the measured information on the popularity and variability of said blocks.
4. The distribution method as claimed in claim 1, wherein the desired flexibility coefficient CFv(i) is determined for each object (i) by means of a decision method called the “decision trapezium”, consisting of a trapezium having a first base consisting of a graduated axis whose vector is the variability in popularity of said object, and a second base consisting of a graduated axis whose vector is the desired flexibility coefficient for said object, said vector having the opposite direction to the preceding one, said decision method comprising the steps of: specifying the ceiling and floor values of each of the variables, namely the variability in popularity and the desired flexibility coefficient, so as to produce a trapezium whose two sides consist of the segments [var ceiling-CF floor] and [var floor-CF ceiling] respectively, if the variability is greater than or equal to var ceiling, projecting this variability onto the value CF floor of the flexibility coefficient; similarly, if the variability is less than or equal to var floor, projecting this variability onto the value CF ceiling; if the variability lies in the interval [var floor-var ceiling], linearly projecting this variability onto the interval [CF floor-CF ceiling].
5. The distribution method as claimed in claim 1, wherein, in order to obtain for each object (i) a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i): a parameter called the CF distance, equal to the absolute value |CFv(i)−CFr(i)|, is calculated for each object (i); for the objects (i) for which CFv(i)<CFr(i), and in the first place for the objects for which the CF distance is maximum, a block of these objects belonging to the piece of highest weight is moved towards the storage device containing the piece of lowest weight; and, for the objects (i) for which CFv(i)>CFr(i), and in the first place for the objects for which the CF distance is maximum, a block of these objects is moved from the fullest storage device towards the least full storage device.
6. The distribution method as claimed in claim 1, wherein: the most loaded storage device, called the hot SD, and the least loaded storage device, called the cold SD, are selected; an infrequently accessed block, called a cold block, is found in the least loaded storage device, and a frequently accessed block, called a hot block, is found in the most loaded storage device, so that these two blocks can be exchanged to rebalance the load; the hot blocks in the hot SD are sorted as follows: the objects are classified in decreasing order of mean popularity, after this classification, n most popular objects, forming a region called the high-popularity region, are selected, the n objects are broken down into blocks which are processed individually to sort them and distribute them in four classes: class 1: blocks for which the movement from the hot SD towards the cold SD decreases the real flexibility coefficient of the objects to which they belong, class 2: blocks moved towards a storage device other than the fullest storage device, class 3: blocks moved towards the fullest storage device, class 4: blocks whose movement increases the CF distance, in the same way, in order to sort the cold blocks in the cold SD, the n least popular objects are selected after classification, these objects forming a region called the region of low popularity, and are broken down into blocks which are processed individually to sort them and distribute them in the aforesaid four classes, after which, starting from these two groups of sorted blocks, namely the groups of hot blocks and cold blocks: the block in the hot SD to be transferred, consisting of a previously sorted block which is located in said hot SD and whose class number is lowest, is found, the block in the cold SD to be transferred, consisting of a previously sorted block which is located in said cold SD and whose class number is lowest, is found, and the selected block in the hot SD is exchanged with the selected block in the cold SD.
7. A device for distributing objects (i) broken down into a plurality of blocks in a heterogeneous group (1) of storage devices (3, 4) according to a distribution law consisting in distributing, in said storage devices, pieces of objects each consisting of a block or a plurality of blocks, said distribution device being characterized in that it comprises: a module, called the analysis module (5), adapted for periodically measuring and calculating values representative of the variability in popularity of each object; a module, called the decision module (6), adapted for calculating for each object (i), from the parameters measured and calculated by the analysis module (5), at an instant t, a coefficient CFr(i), called the real flexibility coefficient, representative of the difference between the weights of the pieces of this object (i), and a coefficient CFv(i), called the desired flexibility coefficient, to be assigned to the object (i), and calculated according to a principle consisting in assigning to each object a flexibility coefficient inversely proportional to its variability in popularity; and a module, called the reconfiguration module (8), adapted for commanding a movement of blocks of pieces of objects between the storage devices, so as to obtain for each object (i) a real flexibility coefficient CFr(i) corresponding to the desired flexibility coefficient CFv(i) for this object.
8. The distribution device as claimed in claim 7, wherein it comprises a module, called the rebalancing module (11), adapted for carrying out operations consisting in exchanging infrequently accessed blocks located in less loaded storage devices with frequently accessed blocks located in highly loaded storage devices.